Aspect Extraction through Semi-Supervised Modeling

نویسندگان

  • Arjun Mukherjee
  • Bing Liu
چکیده

Aspect extraction is a central problem in sentiment analysis. Current methods either extract aspects without categorizing them, or extract and categorize them using unsupervised topic modeling. By categorizing, we mean the synonymous aspects should be clustered into the same category. In this paper, we solve the problem in a different setting where the user provides some seed words for a few aspect categories and the model extracts and clusters aspect terms into categories simultaneously. This setting is important because categorizing aspects is a subjective task. For different application purposes, different categorizations may be needed. Some form of user guidance is desired. In this paper, we propose two statistical models to solve this seeded problem, which aim to discover exactly what the user wants. Our experimental results show that the two proposed models are indeed able to perform the task effectively.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Using Graphs of Classifiers to Impose Constraints on Semi-supervised Relation Extraction

We propose a general approach to modeling semi-supervised learning constraints on unlabeled data. Both traditional supervised classification tasks and many natural semisupervised learning heuristics can be approximated by specifying the desired outcome of walks through a graph of classifiers. We demonstrate the modeling capability of this approach in the task of relation extraction, and experim...

متن کامل

Using Graphs of Classifiers to Impose Declarative Constraints on Semi-supervised Learning

We propose a general approach to modeling semisupervised learning (SSL) algorithms. Specifically, we present a declarative language for modeling both traditional supervised classification tasks and many SSL heuristics, including both well-known heuristics such as co-training and novel domainspecific heuristics. In addition to representing individual SSL heuristics, we show that multiple heurist...

متن کامل

A Framework for Entailed Relation Recognition

We define the problem of recognizing entailed relations – given an open set of relations, find all occurrences of the relations of interest in a given document set – and pose it as a challenge to scalable information extraction and retrieval. Existing approaches to relation recognition do not address well problems with an open set of relations and a need for high recall: supervised methods are ...

متن کامل

Automatic Audio Tagging and Retrieval Using Semi-Surpervised Canonical Density Estimation

We apply SSCDE (semi-supervised canonical density estimation), a semi-supervised learning method based on topic modeling, to audio tagging and retrieval problems. SSCDE was originally proposed as an image annotaion and retireval method, but it can also be applied to audio data. The SSCDE method consists of two parts: 1) extraction of a low-dimentional latent space representing topics of sounds ...

متن کامل

SMACD: Semi-supervised Multi-Aspect Community Detection

Community detection in real-world graphs has been shown to benefit from using multi-aspect information, e.g., in the form of “means of communication” between nodes in the network. An orthogonal line of work, broadly construed as semi-supervised learning, approaches the problem by introducing a small percentage of node assignments to communities and propagates that knowledge throughout the graph...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2012